ABSTRACT
In this paper, we present the ULD-NUIG team's system, designed as part of the Social Media Mining for Health Applications (#SMM4H) Shared Task 2021. We participate in two of the eight tasks, namely "Classification of tweets self-reporting potential cases of COVID-19" (Task 5) and "Classification of COVID19 tweets containing symptoms" (Task 6). We conduct a series of experiments to explore the challenges of both tasks. We use a multilingual pre-trained BERT model for Task 5 and a Generative Morphemes with Attention (GenMA) model for Task 6. In the experiments, we find that GenMA, developed for Task 6, gives better results on both the validation and test datasets. The submitted systems achieve F1 scores of 0.53 for Task 5 and 0.84 for Task 6 on the test datasets. © 2021 Association for Computational Linguistics.
ABSTRACT
The rapid growth of technology, with social media as a platform for communication, has led to a sharp increase in the spread of misinformation and fake news. The ongoing COVID-19 pandemic has made it necessary to review posts on various social media platforms so that people are not exposed to false and dangerous content. Detecting fake news on social media has become the need of the hour. The proposed research work approaches this problem with various Transformer and recurrent models combined with several contextual word embedding models. Furthermore, the effectiveness of the proposed model is evaluated using a different loss function in place of the conventional binary cross-entropy loss. Fake news detection is treated as a sequence classification task, one of the downstream tasks of natural language processing. It is observed that using domain-specific language models together with a custom loss function achieves the highest weighted average F1-score. © 2021 IEEE.
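For reference, the two quantities named in this abstract, the conventional binary cross-entropy loss and the weighted average F1-score, can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the toy labels are assumptions made here, and the custom loss function itself is not specified in the abstract, so it is not reproduced.

```python
import math
from collections import Counter


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * count / len(y_true)
    return score


# Toy labels, purely illustrative (1 = fake, 0 = real); not data from the paper.
toy_true = [1, 1, 1, 0]
toy_pred = [1, 1, 0, 0]
toy_score = weighted_f1(toy_true, toy_pred)
toy_loss = binary_cross_entropy([1, 0], [0.5, 0.5])
```

Weighting each class's F1 by its support means the majority class dominates the reported score, which is worth keeping in mind when the fake and real classes are imbalanced.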